Skip to content

Conversation

WorldofJARcraft
Copy link
Contributor

@WorldofJARcraft WorldofJARcraft commented Aug 27, 2025

This commit adds support for the Xilinx AXI Ethernet Lite device, also known as the emaclite.
The emaclite is a light-weight 10/100 MII Ethernet device commonly found in Xilinx FPGA / SoC designs, as it does not require a separate license.
It foregoes a DMA to reduce chip area. Instead, it uses memory-mapped transmit/receive buffers.
A selection of features can optionally be enabled:

  • a second ("pong") RX/TX Buffer
  • MDIO support
  • Interrupt support

This driver handles the MAC functionality of the core; a driver for the MDIO part is introduced separately as an MDIO driver.

Documentation for the device is available online.
This driver has been tested in an out-of-tree variant of the OpenHW Group cva6 SoC (RISC-V 64) on a Digilent Arty A7 board. In particular, the most commonly used version with all optional features enabled has been tested.

Here is an exemplary device tree entry for this device:

axi-ethernet-lite@40E00000{

		/* shared between MAC and MDIO */
		reg = <0x40E00000 0x10000>;

               compatible = "xlnx,xps-ethernetlite-3.00.a";

		status = "okay";

		arty_ethernet:axi-ethernet-lite-mac {
			/* MAC and MDIO cannot be parent and child, this breaks the preprocessing macros */
			compatible = "xlnx,xps-ethernetlite-3.00.a-mac";
			interrupt-parent = <&plic>;
			interrupts = <12 0x4>;
			clocks = <&clk_bus_0>;
      	
			local-mac-address = [00 0a 35 00 00 01];
			
			xlnx,rx-ping-pong;
			xlnx,tx-ping-pong;
			phy-connection-type = "mii";

			status = "okay";
		};


		axi_ethernetlite_0_mdio: mdio {
			#address-cells = <1>;
			#size-cells = <0>;

			compatible = "xlnx,xps-ethernetlite-3.00.a-mdio";
			status = "okay";

			arty_phy0: phy@1 {
				#address-cells = <1>;
				#size-cells = <0>;
				compatible = "ethernet-phy";
				reg = <1>;
				status="okay";
    			};
	  	};
	};

@WorldofJARcraft WorldofJARcraft force-pushed the add_xilinx_axi_ethernet_lite branch 2 times, most recently from affe6b3 to 491c31a Compare August 28, 2025 15:45
@WorldofJARcraft WorldofJARcraft marked this pull request as ready for review August 28, 2025 16:20
@WorldofJARcraft
Copy link
Contributor Author

FYI @venodela

@WorldofJARcraft WorldofJARcraft changed the title Add xilinx axi ethernet lite Add Xilinx Axi Ethernet Lite Aug 28, 2025
Comment on lines 175 to 158
if (mdio_xilinx_axi_ethernet_lite_check_busy(config)) {
LOG_ERR("MDIO bus busy!");
return -ENOSYS;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should be blocking

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This serves as an error check: In case the device is completely stuck, there is no point in starting a new transaction. So we throw an error in this case.
If the device is not stuck, the device should never be busy when read/write are called, as they block until the transaction completes.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

well if you added a mutex or a semaphore, that would also already stop hinder starting another transfer, while another is ongoing, as it would only be cleared after it finishes. Also how could a mdio controller get stuck? with fixed timings

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Transactions can time out, which releases the mutex.
The busy state can then be encountered by the next - if any - transaction.

Generally, the core is not supposed to get stuck in an MDIO transaction. I only expect this to happen, e.g., when it is not correctly wired up (this is commonly used in FPGA designs, so this can happen).
I figured printing a clear error message here would help the FPGA engineer notice the issue.

@WorldofJARcraft WorldofJARcraft force-pushed the add_xilinx_axi_ethernet_lite branch from 491c31a to ffebac8 Compare August 29, 2025 09:03
@WorldofJARcraft WorldofJARcraft force-pushed the add_xilinx_axi_ethernet_lite branch from ffebac8 to 3cc6a47 Compare August 29, 2025 09:51
Comment on lines 321 to 329
do {
int frag_len = cursor->buf->len;
const uint8_t *frag_data = cursor->buf->data;

xilinx_axi_ethernet_lite_write_transmit_buffer(config, buffer_addr, frag_data,
frag_len);

buffer_addr += frag_len;
} while (xilinx_axi_ethernet_lite_cursor_advance(cursor));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as you are just copying the packet at a specific location, just use net_pkt_read() directly. in https://github.com/zephyrproject-rtos/zephyr/blob/main/drivers/ethernet/eth_litex_liteeth.c it is used similar. You don't have to handle each buffer by its own.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately, this is not guaranteed to work for this IP:
net_pkt_read uses memcpy() internally to move data.
memcpy() uses byte-width loads and stores (e.g., lbu, sbu on RISC-V) for the last <sizeof(long) bytes and (in low-footprint configurations) even for the entire copy operation.
Depending on the CPU implementation, this causes narrow AXI bursts to be issued. Narrow bursts are not supported by the IP (see product guide). I am not sure what would happen in this scenario; my best guess is that you will end up with 4 times the last byte instead of the four last bytes in the transmitted packet.

LiteEth does not have the same problem, as it uses a Wishbone bus which does not support narrow bursts.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you tried it?

Copy link
Contributor Author

@WorldofJARcraft WorldofJARcraft Sep 1, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tested it right now, replacing only the code on the RX side:

static inline int axi_eth_lite_read_to_pkt(const struct axi_eth_lite_config *config,
					   struct net_pkt *pkt, mem_addr_t buffer_addr,
					   size_t bytes_to_read)
{
	return net_pkt_write(pkt, (void*)buffer_addr, bytes_to_read);
}

The original function was:

static inline int axi_eth_lite_read_to_pkt(const struct axi_eth_lite_config *config,
					   struct net_pkt *pkt, mem_addr_t buffer_addr,
					   size_t bytes_to_read)
{
	int ret = 0;

	for (size_t read_bytes = 0; read_bytes < bytes_to_read; read_bytes += sizeof(uint32_t)) {
		uint32_t current_data = axi_eth_lite_read_reg(config, buffer_addr);
		size_t bytes_to_write_now = read_bytes + sizeof(uint32_t) > bytes_to_read
						    ? bytes_to_read - read_bytes
						    : sizeof(uint32_t);

		ret += net_pkt_write(pkt, &current_data, bytes_to_write_now);

		if (ret < 0) {
			LOG_ERR("Write error bytes %zu/%zu (%zu)", read_bytes, bytes_to_read,
				bytes_to_write_now);
		} else {
			LOG_DBG("Write OK bytes %zu/%zu (%zu) cursor %p remaining %d", read_bytes,
				bytes_to_read, bytes_to_write_now, pkt->cursor.buf,
				pkt->cursor.buf ? pkt->cursor.buf->size - pkt->cursor.buf->len : 0);
		}

		buffer_addr += sizeof(uint32_t);
	}
	return ret;
}

The results were interesting: using net_pkt_write directly instead of in a loop causes no problems for some packets, but breaks reception for other packets:
image
My assumption is that - as I explained above - packets with a length that is a multiple of sizeof(long) like ARP can be copied with no issue, but packets with different lengths cause issues due to byte-wise read and write operations.
Again, the results might be different depending on the implementation of the CPU, bus, minor revisions of the IP, ..., so I would be hesistant to use this approach even if it did work on my machine.
Especially as this can cause reception to fail silently, I believe this is a hard-to-troubleshoot bug waiting to happen.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how about rounding the size up to the next value dividable by 4. Saw something like that in eth_smsc911x.c

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would still break if memcpy is implemented something like this:

void *memcpy(void *dst, const void *src, size_t n){
    uint8_t *dst_ptr = dst;
    const uint8_t *src_ptr = src;
    while(n--){
        *dst_ptr++ = *src_ptr++;
    }
    return dst;
}

This is actually the size-optimized implementation of memcpy in picolibc.
Generally, using any wrapper function for copying to/from the FIFO in the device makes assumptions about how the copying is implemented internally, which I would like to avoid.
Again, the AXI bus exacerbates the problem here. Other buses do not behave like this, so it is a none-issue for many other devices.

Comment on lines 410 to 427
hdr = (struct net_eth_hdr *)header_buf;
len = hdr->type;
/*
* AXI Ethernet Lite cannot tell us the length of the received packet, so we try to parse it
* Also, FCS is not used by Zephyr stack
*/
switch (ntohs(len)) {
case NET_ETH_PTYPE_ARP:
/* fixed length */
packet_size =
sizeof(struct net_eth_hdr) + XILINX_AXI_ETHERNET_LITE_ARP_PACKET_LENGTH;
break;
case NET_ETH_PTYPE_IP: {
const struct net_ipv4_hdr *ip4_hdr =
(const struct net_ipv4_hdr *)&header_buf[sizeof(*hdr)];
len = ip4_hdr->len;
packet_size = ntohs(len);
/* length includes ipv4 header length */
packet_size += sizeof(struct net_eth_hdr);
break;
}
case NET_ETH_PTYPE_IPV6: {
const struct net_ipv6_hdr *ip6_hdr =
(const struct net_ipv6_hdr *)&header_buf[sizeof(*hdr)];
/* payload + any optional extension headers */
len = ip6_hdr->len;
packet_size = ntohs(len);
packet_size += sizeof(struct net_eth_hdr) + sizeof(*ip6_hdr);
break;
}
default:
/* use the full MTU... */
break;
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the len of a ethernet packet is always at the same location, independent of independent of the protocol (ARP, ipv4, ipv6) in it.
use that.

https://docs.amd.com/r/en-US/pg135-axi-ethernetlite/Type/Length

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This type/length field is a part of the Ethernet MAC header.
This only indicates the packet length for the LLC (link-layer control) L2 protocol, a predecessor of modern Ethernet, which is not commonly used any more.
Modern networks generally use this field as a Type indicator (EtherType) instead.
For example, zephyr's network stack always interprets this as a type instead of a length, which is always the same for each protocol irrespective of the packet length.

For that reason, the IP has a separate transmit size MMIO register and does not rely on the type/length field for transmissions.
I have no idea why the same does not exist for RX; this was probably a design oversight.

Comment on lines 347 to 359
static inline int
xilinx_axi_ethernet_lite_read_to_pkt(const struct xilinx_axi_ethernet_lite_config *config,
struct net_pkt *pkt, mem_addr_t buffer_addr,
size_t bytes_to_read)
{
int ret = 0;

for (size_t read_bytes = 0; read_bytes < bytes_to_read; read_bytes += sizeof(uint32_t)) {
uint32_t current_data = xilinx_axi_ethernet_lite_read_reg(config, buffer_addr);
size_t bytes_to_write_now = read_bytes + sizeof(uint32_t) > bytes_to_read
? bytes_to_read - read_bytes
: sizeof(uint32_t);

ret += net_pkt_write(pkt, &current_data, bytes_to_write_now);

if (ret < 0) {
LOG_ERR("Write error bytes %zu/%zu (%zu)", read_bytes, bytes_to_read,
bytes_to_write_now);
} else {
LOG_DBG("Write OK bytes %zu/%zu (%zu) cursor %p remaining %d", read_bytes,
bytes_to_read, bytes_to_write_now, pkt->cursor.buf,
pkt->cursor.buf ? pkt->cursor.buf->size - pkt->cursor.buf->len : 0);
}

buffer_addr += sizeof(uint32_t);
}
return ret;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same issue with byte-size transfers.

struct k_sem tx_sem;
/* used to trigger the RX ISR from time to time */
struct k_timer rx_timer;
/* used to offload copying an RX packet outside of ISR context */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not copying it in the isr? Would be less overhead

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Workqueues are scheduled. Hence, higher-priority threads can preempt them.
ISRs cannot be preempted.
As copying the packet takes many cycles, and the Ethernet Lite is a low-performance core anyway, I figured it made sense to copy packets in a preemptible context.

@WorldofJARcraft WorldofJARcraft force-pushed the add_xilinx_axi_ethernet_lite branch from 3cc6a47 to 634b7e0 Compare August 29, 2025 14:53
Copy link

depends on DT_HAS_XLNX_XPS_ETHERNETLITE_3_00_A_MAC_ENABLED
depends on LITTLE_ENDIAN # byte order operations might break for BE
help
Enable Xilinx AXI Ethernet Lite MAC driver, commonly found on Xilinx FPGAs and SoCs.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I expect pattern is tab + 2 spaces here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed, thanks

default 1
depends on ETH_XILINX_AXI_ETHERNET_LITE
help
Period for periodic RX timer task in AXI Ethernet Lite.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The same here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed, thanks

@michalsimek
Copy link
Contributor

nit: s/status = "okey";/status = "okay";/

I can't see any reason why new DT binding should be used compare to one which exists for years in Linux/U-Boot.
I can don't think there is a need to introduce -mac compatible string when you already have xlnx,xps-ethernetlite-3.00.a. And then reference back parent for reg property.
The same is with defining new xlnx,xps-ethernetlite-3.00.a-mdio for mdio. When mdio child is present mdio exists and access can be used.

mdio is optional but won't work without actual "mac" that's why can't see any reason why it should be described in a way that they work as two separate entities and also two separate drivers.
From my perspective you should use

ethernet@xxx {
compatible = ...
reg = ...
mdio {
...
};
};

and create only single driver not two.

@ibirnbaum
Copy link
Member

ibirnbaum commented Oct 3, 2025

nit: s/status = "okey";/status = "okay";/

I can't see any reason why new DT binding should be used compare to one which exists for years in Linux/U-Boot. I can don't think there is a need to introduce -mac compatible string when you already have xlnx,xps-ethernetlite-3.00.a. And then reference back parent for reg property. The same is with defining new xlnx,xps-ethernetlite-3.00.a-mdio for mdio. When mdio child is present mdio exists and access can be used.

mdio is optional but won't work without actual "mac" that's why can't see any reason why it should be described in a way that they work as two separate entities and also two separate drivers. From my perspective you should use

ethernet@xxx { compatible = ... reg = ... mdio { ... }; };

and create only single driver not two.

The thing about this is: MDIO doesn't quite work like this in Zephyr, at least not without implications that would also apply to any other MDIO/MAC pair present in the current target system (example: GEM + its integrated MDIO interface toghether with this AXI IP + its integrated MDIO interface, each with a distinct device driver, in the same target platform device tree).

At some point (and I don't know the relevant history behind this concept) someone decided that MDIO and Ethernet MAC should be separate classes of device drivers, thus requiring separate compatibles for the MAC and the MDIO nodes, and that those two classes have distinct initialization priorities. In conjunction with this, it was decided that by default the MDIO class has a higher initialization priority than the Ethernet MAC class. The result being that with an MDIO child node below the MAC node as you suggest (and which I prefer as well, I'll get back to that later), your application will compile but won't link because at the very last moment, a compile-time assertion will trigger, telling you that a child node can't have a higher init priority than its parent.

There's two approaches to handle this:

  • Change the Kconfig values of the two device classes' initialization priorities. This has to be done at the SoC level and has been tested to work, however, this means that for any other MAC/MDIO driver pair present in the system, the order will change as well and the drivers that are not your own might not expect this.
  • The NXP way: there's one NXP MAC driver in tree whose DT layout regarding MAC and MDIO looks just like you suggest. It works without tampering with the init priorities. How did NXP pull that off? They just ignored the Ethernet MAC class init priority in their MAC driver and hardcoded it to 0. Probably not the best approach...

There are some other caveats when it comes to this class separation of MDIO and MAC when sticking to the original init order: in that case, the MDIO driver is the first one to touch the respective controller's register set (if MAC + MDIO share the register space such as in the GEM) and therefore should perform the controller reset and potentially clean up some registers to default values. If the MDIO node is not active, the MAC driver must know this (the first way to do this that comes to mind is to have the config data struct of the MAC driver contain a pointer to the MDIO node or NULL if not enabled) and must perform the reset sequence itself - and vice versa, if MDIO is active, the MAC driver must not do anything that could kill the already initialized MDIO and must stay away from any reset or similar action that might disable MDIO again. This code duplication isn't nice, but after all, there are valid scenarios where an MDIO-capable MAC is used without an attached PHY.

Also, there are risks of misconfiguration due to how the DT data is converted to the devicetree_generated header and is ultimately resolved in the driver code at compile-time, which make the MDIO child node approach more or less mandatory for MACs of which multiple instances exist (GEM0/1/2/3 for example):

  • DT_INST_FOREACH_STATUS_OKAY(n) always starts counting at 0.
  • There's no association with the instance numbers used in the DT
  • Therefore, the information generated at compile-time only matches if either only the first instance of the respective device is enabled or all instances are enabled.
  • If GEM0 is disabled, GEM1 is enabled, GEM2 is disabled and GEM3 is enabled in the ZynqMP for example, then GEM1 will be referred to as gem0 in all generated structs/variables/function names for the xlnx_gem driver, and GEM3 will be referred to as gem1.
  • If the MDIO node is parallel to the MAC node (which prevents the compile time assertion triggering and any undesired effects caused by unsuspecting drivers if the default init priority values are changed), and the driver resolves its MDIO device pointer in its dev_conf struct using DT_NODELABEL(gem##n##_mdio) as value for DEVICE_DT_GET, in the above example, GEM1 will try to work with the MDIO node belonging to GEM0 because in code, GEM1 is gem0, and GEM3 will try to work with the MDIO node belonging to gem1.

So the MDIO as child of MAC approach solves this issue in a failsafe way, but comes with all the issues described above. You can tell that the way things are currently meant to be was designed at a time when no one considered the possibility of having more than one MAC in any target system and/or more than one MDIO node. Until recently, it was not possible to pick a target device in the MDIO shell extension, it always worked with the first MDIO device exclusively.

Unfortunately, the DT-related macro magic in Zephyr is lacking certain resolve operations, such as "give me the node after mine" (would allow hassle-free implementation with MDIO parallel to the MAC if certain conventions are adhered to in the DT) or "give me the first-best child node" (children can only be obtained in bulk as an array or NULL, or by explicit addressing). There are other scenarios that are likely way more difficult to implement than is the case under Linux. At the time being, my approach for the GMII-to-RGMII converter would be:

  • MDIO is child of MAC, init priorities changed for any SoC supporting the GEM, hoping that no one objects to that
  • phy-handle points to the phy device which is a child of the MDIO node
  • GMII-to-RGMII-converter is another child of the MDIO node, parallel to the PHY node
  • The driver has to check for the existance of the converter by getting the bulk list of all children of the MDIO node and checking if there's another device pointer besides to one that is equal to phy-handle.
  • It can only be assumed that this is actually the converter and not some weird mix-up of any kind
  • ... because it's not possible yet to my knowledge to resolve children (or any device) by binding such as "xlnx,gmii-to-rgmii-converter".

It shouldn't be forgotten that in Zephyr, the DT serves as a means of shoe-horning constant values related to the hardware configuration into driver or application source files at compile-time and that the facilities for doing so are probably not close to what Linux is capable of in terms of processing the DT data.

PS the reason why no such issue has ever existed with the GEM driver before is that at the time it was first implemented, MDIO didn't exist in Zephyr yet. I think that aside from the GEM driver, there was only one other driver with integrated MDIO/PHY facilities at the time. The current switch-over is taking place to prevent duplicate PHY driver code and to allow the GEM to use any PHY currently supported by the system.

This commit adds support for the Xilinx AXI Ethernet Lite device,
also known as the emaclite.
The emaclite is a light-weight 10/100 MII Ethernet device.
It can optionally be configured to include an MDIO.
The MMIO interface is controlled via MMIO registers and requires
the software to busy-wait until completion.

Signed-off-by: Eric Ackermann <[email protected]>
This commit adds support for the Xilinx AXI Ethernet Lite device,
also known as the emaclite.
The emaclite is a light-weight 10/100 MII Ethernet device.
It foregoes a DMA to reduce chip area.
Instead, it uses memory-mapped transmit/receive buffers.
A selection of features can optionally be enabled:
- a second ("pong") RX/TX Buffer
- MDIO support
- Interrupt support
This driver handles the MAC functionality of the core;
a driver for the MDIO part is introduced separately
as an MDIO driver.

Signed-off-by: Eric Ackermann <[email protected]>
@WorldofJARcraft WorldofJARcraft force-pushed the add_xilinx_axi_ethernet_lite branch from 154cc24 to f4db426 Compare October 6, 2025 18:21
@WorldofJARcraft
Copy link
Contributor Author

@ibirnbaum thanks for the summary. I would like to add that for the AXI Ethernet Lite (and many other families of Ethernet IP), the MDIO can be configured independent of the MAC or even removed completely at hardware synthesis/elaboration time. For that reason, depending on the device in question, the MDIO might even have separate register ranges or interrupts. For example, Xilinx' AXI Ethernet Subsystem has two interrupts, one for the MDIO and one for optional MAC features, both of which can be disabled at the hardware level.
Hence, to me, it makes sense that the MDIO is a separate device.
Actually, I remember that we had a similar discussion for my other Xilinx ethernet driver, the AXI Ethernet Subsystem (#73986 (comment)) as well. Hence, I oriented myself on how that driver is structured when working on this PR.

I would also like to point out that a more general discussion regarding the MDIO/MAC architecture should probably be had, and a larger refactoring may be necessary.
Discussing this individually for each new driver creates a lot of extra work and has the potential of solving the same problem differently for different devices.

Copy link

sonarqubecloud bot commented Oct 6, 2025

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants